Abstract

We investigate the problem of doing post mortem fault isolation for
concurrent systems using a behavioral model. The aim is to isolate the
action that has caused the failure of the system, the root action. The
naive approach would be to say that a certain action is the root action
iff it is a logical consequence of the model and observations that the
action is the first “bad thing to happen”. This, however, is a strong
requirement and puts high demands on the model. In this paper we describe
the concept of strong root candidate, a relaxation of the naive approach.
The advantage of determining the strong root candidate directly from model
and observations is that the set of traces consistent with model and
observations need not be explicitly computed. The property of strong root
candidate can instead be determined on-the-fly, thus only computing
relevant parts of the reachable state space.

1 Introduction 

In this paper we describe a model-based [Hamscher et al., 1992] approach
to fault isolation in object oriented control software. The work is
motivated by a real industrial robot control system developed by ABB
Robotics. The system is large (the order of $10^6$ lines of code),
concurrent, has an object oriented architecture and is highly
configurable, supporting different types of robots and cell
configurations. Since the system is time- and safety-critical, the first
priority, in case of a failure, is to bring the system to a safe state;
alarms that go off are logged and can be analyzed when the system comes to
a stand-still. The faults considered are primarily hardware faults, and
therefore we rely on the assumption that the failing hardware has some
software counterpart that is affected by the failure of the hardware. In
addition we make the common single fault assumption, i.e. that a system
failure is caused by only one fault (but resulting in cascading alarms).

The log thus contains partial information about the events that took place
at the approximate time of the system failure. However, the order in which
messages are logged does not necessarily reflect the way error messages
propagate - the system is concurrent and safety critical actions may have
to be taken before error reporting takes place. Hence, in what follows we
(somewhat conservatively) view the log as a set of error messages. In
addition a system may contain a number of critical events that are
unobservable, but which may explain all observable alarms.

The ultimate aim of our fault isolation method is to single out the error
message that explains the actual cause of the failure, or possibly an
unobservable critical event explaining the observations. That is, we aim
to discard error messages which are definitely effects of other error
messages, while trying to isolate error messages (or critical events)
which explain all other messages. In contrast to message filtering, we can
thus find failing components that have not manifested themselves in the
error log, if the failing of the component is a logical consequence of the
model and the observations. Given the size of the software it is not
possible to use the code directly - we have to rely on a model of the
software. In this paper we consider finite state machine models expressed
in a process algebra. The process algebra is chosen here because it allows
for more straightforward formal reasoning than for example state charts,
but the contribution of this work - the fault isolation - relies only on
the labeled transition system semantics of the model. In practice, the aim
is to use a behavioral model that is an artifact of the software
development process, such as state charts. Then there is no extra cost
associated with maintaining a correct model when the software evolves,
since then so does the model.

In standard AI diagnosis literature, see e.g. [Reiter, 1987], a diagnosis
is a (minimal) set of failed components explaining the observations. But
for dynamic systems (systems with state) a diagnosis is often defined as
the set of all traces, or trajectories, consistent with the observations
(see e.g. [Cordier et al., 2001; Console et al., 2000]). However, this
definition is generally insufficient to isolate the origin of the
fault(s), and requires post-processing to pin-point e.g. the faulty
components. Our approach is more direct and focuses on finding the alarm
that explains (is consistent with) all observables: given the system
description, expressed in a simple process algebra, and the observations,
we try to infer the origin of the fault using properties of actions
involving the temporal order, expressed in a specification language based
on a subset of the temporal logic CTL, originally developed for
verification [Clarke et al., 1999]. This resembles the process of model
checking and, as in the case of model checking, there is no need to
calculate the entire state space (obviously equivalent to the set of
traces consistent with model and observations) if the temporal logic
formulae are evaluated by constructing the state space on-the-fly.

Our approach also bears some resemblance to that of Sampath et al.
[Sampath et al., 1995]. However, their work is mainly concerned with
diagnosability in discrete event systems; i.e. to detect, within finite
delay, whether a certain type of fault has occurred. While our approach is
amenable only to post-mortem analysis, the work reported in [Sampath et
al., 1995] is mainly intended for monitoring and on-line detection and
diagnosis.

The rest of the paper is organized as follows: In Section 2 we describe
the behavior language that will be used to define a transition relation
that defines the set of all possible behaviors (i.e. traces). In Section 3
we provide rules for entailment of some predicates of interest from
configurations and the traces that can follow from them. Finally, we
outline ongoing and future work in Section 4.

2 A behavior  language 

A behavior model can be expressed in different ways, and we have chosen to
use a process algebra. Whichever formalism and notation is used, the
semantics should provide a labeled transition relation that describes the
state transitions of the modeled system. In this section we describe a
process algebra influenced by CCS [Milner, 1989] and give the necessary
semantics.

2.1  Processes 

Our process language is constructed from the following syntactic
categories

• a finite set $\mathcal{L}$ of action labels denoted by $a$ in our meta
language. Every action label is equipped with an associated arity
$n \geq 0$.

• a set $O$ of object id’s denoted by $o$.

• a finite set $S$ of states $A$ with associated arity $n \geq 0$.

We consider four types of actions (denoted by $\alpha$ in our meta
language).

• Send actions of the form $o{:}a(t)$, where $o$ is the recipient object,
$a$ an $n$-ary action label and $t$ is an $n$-tuple of object id’s or
variables.

• Receive actions of the form $a(x)$ where $a$ is an $n$-ary action label
and $x$ is an $n$-tuple of variables.

• Internal actions of the form $a$, where $a$ is a nullary action label.

• New-actions of the form $\mathit{new}(o, P)$ where $o \in O$ and $P$ is
a process expression, defined below.

A process is described by a process expression, denoted by $P$ (and
occasionally $Q$), and given by the following abstract syntax

$$P ::= A(t) \;\mid\; \sum_{i \in I} \alpha_i.P_i$$

where $I$ is a finite index set. Sums are usually written simply
$\alpha_1.P_1 + \alpha_2.P_2$. We reserve the nullary state
$\mathit{Stop}$ for a completed process. We assume that every state
$A \in S$ of arity $n$ ($\mathit{Stop}$ excepted) has a defining equation
of the form

$$A(x) \stackrel{\mathrm{def}}{=} P$$

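A process state $\sigma$ is a partial map from $O$ to $\mathcal{P}$, the
set of process expressions. The object $\mathit{init} \in O$ is called the
initializing object, the state $\mathit{Main} \in S$ is called the main
process and the state $\sigma_0 := \{\mathit{init} \mapsto \mathit{Main}\}$
is called the initial process state.

Let $\sigma : O \to \mathcal{P}$ be a process state, $o \in O$ and
$P \in \mathcal{P}$. By $\sigma[o \mapsto P]$ we denote the process state
which is almost identical to $\sigma$ except possibly at $o$. That is

$$\sigma[o \mapsto P](x) := \begin{cases} P & \text{if } x = o \\ \sigma(x) & \text{otherwise} \end{cases}$$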
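The update $\sigma[o \mapsto P]$ can be illustrated with a small Python sketch. It is not part of the formalism: it assumes, purely for illustration, that a process state is a dictionary from object id's to process expressions, both written as strings, and the helper name `update` is hypothetical.

```python
def update(sigma, o, p):
    """Return sigma[o -> p]: a copy of sigma that maps o to p.

    sigma is a dict (a partial map from object id's to process
    expressions); the original state is left unchanged.
    """
    new_sigma = dict(sigma)
    new_sigma[o] = p
    return new_sigma


# The initial process state {init -> Main}.
sigma0 = {"init": "Main"}
# Adding a new object s1 that starts in state S.
sigma1 = update(sigma0, "s1", "S")
# Overwriting an existing entry.
sigma2 = update(sigma1, "init", "Stop")
```

Returning a fresh dictionary keeps earlier process states intact, which matters when several successor states are explored from the same state.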

The behaviors of our system are described by the labeled transition rules
in Figure 1. Our transitions are of the form

$$\sigma \xrightarrow{\;\mathbf{a}\;} \sigma'$$

where $\mathbf{a}$ (the observation) is a set of pairs of the form
$(o, a)$ representing action $a$ occurring in object $o$.

There are four transition rules, sync, internal, new and def. The rule
sync allows two objects to synchronize their state transitions and
optionally exchange values. In our limited algebra, the only values that
can be transmitted are object identifiers. However, the idea is not to
model all system behavior, but to have a system model that reveals
synchronization and system structure. The rule internal allows a single
object to perform a transition by itself. Creation of new objects is
handled by the rule new, and def allows for exchanging a state with its
definition.¹

Example 

Typically, a system is described by creating a main process that sets up
the system structure. Figure 2 shows an example of such a system. Process
Main creates three objects and runs Setup, which tells the objects about
each other via the init call. This is needed since, when started, a
process does not know anything about its environment. After init, each
object will act as a peer-to-peer node, as shown in Figures 3 (the system)
and 4 (object details). Objects can send requests to each other, and
sometimes the answer to a request is a failure, and then the system is
brought to a halt by transmission of down messages.

3 Fault  Isolation 

The available information when doing fault isolation is a system model and
an observation (in our case a message log). We use the term scenario to
refer to that information. In the following we overload the term action in
the context of scenarios to mean pairs $(o, a) \in O \times \mathcal{L}$
where $o$ is an object identifier and $a$ is an action label. Some of the
actions in a system are critical actions, actions that are associated with
system failures.

Thus a scenario is a quadruple
$(\to, \mathit{Crit}, \mathit{Log}_p, \mathit{Log}_n)$, where $\to$ is a
process state transition relation, $\mathit{Crit} \subseteq O \times \mathcal{L}$

¹ Since we rely on a finite state space model, we do not allow unbounded
creation of objects via the new rule.


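sync: [rule not recoverable from the extracted text]

$$\textit{internal}: \quad \frac{\sigma(o_i) = P_1 + a.P + P_2}{\sigma \xrightarrow{\{(o_i,\,a)\}} \sigma[o_i \mapsto P]}$$

$$\textit{new}: \quad \frac{\sigma(o_i) = P_1 + \mathit{new}(o, Q).P + P_2 \qquad \sigma[o_i \mapsto P][o \mapsto Q] \xrightarrow{\;\mathbf{a}\;} \sigma'}{\sigma \xrightarrow{\;\mathbf{a}\;} \sigma'}$$

$$\textit{def}: \quad \frac{\sigma(o_i) = A(t) \qquad A(x) \stackrel{\mathrm{def}}{=} P \qquad \sigma[o_i \mapsto P\{x/t\}] \xrightarrow{\;\mathbf{a}\;} \sigma'}{\sigma \xrightarrow{\;\mathbf{a}\;} \sigma'}$$

Figure 1: Process transition rules ($t$ is a vector of object id’s)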


Servent(this, x, y)    ≝ x:req(this).Wait(this, x, y) + y:req(this).Wait(this, x, y)
                         + req(o).Compute(this, x, y, o) + down().Down
Wait(this, x, y)       ≝ ok().Servent(this, x, y) + fail().Fail(x, y)
Compute(this, x, y, o) ≝ o:ok().Servent(this, x, y) + o:fail().Servent(this, x, y)
Fail(x, y)             ≝ x:down().Fail(x, y) + y:down().Fail(x, y)
Down                   ≝ Stop
S                      ≝ init(this, x, y).Servent(this, x, y)
Main                   ≝ new(s1, S).new(s2, S).new(s3, S).Setup(s1, s2, s3)
Setup(x, y, z)         ≝ x:init(x, y, z).y:init(y, z, x).z:init(z, x, y).Stop

Figure 2: A process algebra example


is the set of critical actions, $\mathit{Log}_p \subseteq O \times \mathcal{L}$
is the set of actions that have been observed (i.e. the message log), and
$\mathit{Log}_n \subseteq O \times \mathcal{L}$ is the set of actions
known not to have occurred (i.e. the observable actions not contained in
the message log). Thus, we assume that a synchronized action is logged as
two separate actions - one from the sending object and one from the
receiving. This allows modeling of message sending with unknown receiver
and is no severe limitation, since it is possible to express receiver
information by having a model where the desired action labels are unique
and the receiver object id thus becomes unambiguous.

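A configuration, denoted $C$, is the symbol $\bot$ or a pair $(\sigma, l)$
where $\sigma$ is a process state and $l \subseteq O \times \mathcal{L}$
is a set of actions. The following rules define the configuration
transition relation $\Rightarrow$ for a given $\to$ and $\mathit{Log}_n$.

$$\frac{\sigma \xrightarrow{\;\mathbf{a}\;} \sigma' \qquad \mathbf{a} \cap \mathit{Log}_n = \emptyset}{(\sigma, l) \Rightarrow (\sigma', l \cup \mathbf{a})}$$

$$\frac{\sigma \xrightarrow{\;\mathbf{a}\;} \sigma' \qquad \mathbf{a} \cap \mathit{Log}_n \neq \emptyset}{(\sigma, l) \Rightarrow \bot}$$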

The configuration $(\{\mathit{init} \mapsto \mathit{Main}\}, \emptyset)$
is called the initial configuration. The configuration $\bot$ is called a
forbidden configuration and represents configurations that are allowed by
the behavioral model, but inconsistent with the observations at hand. We
see configurations as snapshots of the system state of a given scenario,
and the configuration transition relation describes the behavior of the
system. Fault isolation is the process of finding the first critical
action that has occurred in a given scenario, the root action. Given the
single fault assumption and a system model that is properly designed, the
first critical action to occur in the system is the cause of the failure.
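The two configuration rules can be read as a single step function. The following Python sketch is only an illustration under stated assumptions: process states are opaque hashable values, observations and logs are `frozenset`s of `(object, label)` pairs, and the names `BOTTOM` and `config_step` are hypothetical. The process transition itself (the premise of both rules) is taken as given.

```python
BOTTOM = "bottom"  # stands for the forbidden configuration


def config_step(config, sigma_next, obs, log_n):
    """Configuration step for a process transition sigma --obs--> sigma_next.

    config is a pair (sigma, seen); obs and log_n are frozensets of
    (object, label) pairs.
    """
    if config == BOTTOM:
        # No rule applies to the forbidden configuration.
        raise ValueError("the forbidden configuration has no successors")
    _sigma, seen = config
    if obs & log_n:
        # Some action known not to have occurred would occur: forbidden.
        return BOTTOM
    # Otherwise record the observed actions in the configuration.
    return (sigma_next, seen | obs)


log_n = frozenset({("o2", "fail")})
c0 = ("sigma0", frozenset())
c1 = config_step(c0, "sigma1", frozenset({("o1", "fail")}), log_n)
c2 = config_step(c1, "sigma2", frozenset({("o2", "fail")}), log_n)
```

Here `c1` records that `("o1", "fail")` has occurred, while `c2` is the forbidden configuration because `("o2", "fail")` is in the negative log.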

An action $a$ is present in a scenario if the system model and the
observation entail the occurrence of $a$. An action $a$ is an enabled root
if the assumption that $a$ is the root action is consistent with the
observations and the system model. We introduce the concept of strong root
candidate, and say that a strong root candidate is an action that is both
present and an enabled root.

3.1  Predicate  rules 

Given a certain scenario $(\to, \mathit{Crit}, \mathit{Log}_p, \mathit{Log}_n)$,
we wish to reason about properties of reachable configurations. Therefore
we define predicates that correspond to the interesting properties, by
determining for which configurations they hold true. Since we are
interested in strong root candidates, we need to formally define present
actions and enabled root actions. Thus we define the predicate
$\mathit{present}(a)$ that holds in configurations where action $a$ must
occur sometime in the future and the predicate $\mathit{enabledroot}(a)$
that holds for configurations where it is consistent to assume that $a$
may be the first critical action to occur. In defining these two
predicates, we will need some helper predicates. We will use
$\mathit{okend}$, which holds in configurations that correspond to
consistent ending states of the system. An ending state is a state where
no more observable actions occur, i.e. when the system has reached a final
state. In a configuration where $a$ has occurred, $\mathit{seen}(a)$
holds, while $\mathit{nocrit}$ holds in configurations where no critical
action has occurred. The predicate $\mathit{end}$ holds in configurations
where there is no next configuration.

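We define entailment of logical formulae from the following syntax:

$$F ::= F \lor F \mid F \land F \mid \lnot F \mid EF(F) \mid EX(F) \mid AG(F) \mid \mathit{end} \mid \mathit{okend} \mid \mathit{nocrit} \mid \mathit{seen}(a) \mid \mathit{present}(a) \mid \mathit{enabledroot}(a)$$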

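In order to be able to define entailment for the desired predicates, we
will need the following. We use $\Rightarrow^*$ for the reflexive
transitive closure of $\Rightarrow$. First we define entailment for basic
connectives.

$$\frac{C \models F_1 \qquad C \models F_2}{C \models F_1 \land F_2} \qquad \frac{C \models F_1}{C \models F_1 \lor F_2} \qquad \frac{C \models F_2}{C \models F_1 \lor F_2} \qquad \frac{C \not\models F}{C \models \lnot F}$$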

We  will  be  reasoning  about  temporal  order,  so  we  need  to 
define  temporal  logic  operators. 

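$$\frac{C \Rightarrow^* C' \qquad C' \models F}{C \models EF(F)} \qquad \frac{C \Rightarrow C' \qquad C' \models F}{C \models EX(F)} \qquad \frac{C' \models F \text{ whenever } C \Rightarrow^* C'}{C \models AG(F)}$$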
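As a minimal sketch of how these three operators can be evaluated on-the-fly, the following Python code assumes that the configuration transition relation is available as a function `succ` returning the successors of a configuration, and that predicates are Python functions on configurations. All names are illustrative; this is not the prototype described in Section 3.3.

```python
def reachable(succ, c):
    """Lazily yield every C' with C =>* C' (reflexive transitive closure)."""
    visited, stack = {c}, [c]
    while stack:
        cur = stack.pop()
        yield cur
        for nxt in succ(cur):
            if nxt not in visited:
                visited.add(nxt)
                stack.append(nxt)


def EX(succ, pred):
    """pred holds in some direct successor."""
    return lambda c: any(pred(n) for n in succ(c))


def EF(succ, pred):
    """pred holds in some reachable configuration (search stops at a witness)."""
    return lambda c: any(pred(n) for n in reachable(succ, c))


def AG(succ, pred):
    """pred holds in every reachable configuration (stops at a counterexample)."""
    return lambda c: all(pred(n) for n in reachable(succ, c))


# A toy transition graph standing in for the configuration relation.
graph = {0: [1, 2], 1: [3], 2: [], 3: []}
succ = lambda c: graph[c]
```

Because `reachable` is a generator, `EF` and `AG` explore only as much of the state space as is needed to find a witness or a counterexample; they terminate in general only because the state space is assumed finite.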

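We also need entailment for a few helper predicates. The predicate
$\mathit{end}$ determines if a configuration lacks a successor (i.e.
$\mathit{end} = \lnot EX(\mathit{true})$ where $\mathit{true}$ is entailed
by every configuration), $\mathit{seen}(a)$ is true when an action $a$ has
occurred and $\mathit{nocrit}$ holds when no critical actions have yet
occurred.

$$\frac{\lnot \exists C'.\; C \Rightarrow C'}{C \models \mathit{end}} \qquad \frac{a \in l}{(\sigma, l) \models \mathit{seen}(a)} \qquad \frac{\forall a \in l.\; a \notin \mathit{Crit}}{(\sigma, l) \models \mathit{nocrit}}$$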

Now we have the tools needed to define the desired predicates. If we have
reached a configuration from which the system cannot continue to execute
and all actions in $\mathit{Log}_p$ are seen, then the configuration is an
$\mathit{okend}$, unless the configuration is a forbidden configuration.
It is thus one of the possible halting configurations, given the scenario
at hand.

$$\frac{\forall a \in \mathit{Log}_p.\; C \models \mathit{seen}(a) \qquad C \models \mathit{end} \qquad C \neq \bot}{C \models \mathit{okend}}$$

If it is true for all reachable configurations that whenever we have
reached an $\mathit{okend}$, we have seen action $a$, we conclude that the
presence of $a$ is entailed from observations and system model.

$$\frac{C \models AG(\lnot \mathit{okend} \lor \mathit{seen}(a))}{C \models \mathit{present}(a)}$$

If there is a reachable configuration $C_1$ such that no critical action
has taken place, and there is a configuration step that takes us from
$C_1$ to $C_2$ where the critical action $a$ has occurred, we conclude
that $a$ is an enabled root if it is possible to reach an $\mathit{okend}$
from $C_2$.

$$\frac{a \in \mathit{Crit} \qquad C \models EF(\mathit{nocrit} \land EX(\mathit{seen}(a) \land EF(\mathit{okend})))}{C \models \mathit{enabledroot}(a)}$$


3.2  Reasoning  about  behavior 

Given a scenario, the strong root candidates are the actions $a$ for which

$$(\{\mathit{init} \mapsto \mathit{Main}\}, \emptyset) \models \mathit{present}(a) \land \mathit{enabledroot}(a)$$

If we have no strong root candidates or more than one strong root
candidate, the system model is not strong enough for efficient fault
isolation. If, on the other hand, we have exactly one strong root
candidate, we assume that we have pinpointed the true cause of the fault.
This is reasonable to assume, since the action found is the only one that
is known to have occurred (its presence is entailed by the scenario) and
it is consistent with the given scenario to assume that the action is a
root event.

Of course there is still a possibility that there are other enabled root
events whose presence is consistent with the scenario, but assuming one of
them to be root would demand an explanation of why the strong root
candidate (proven to be present!) is not the root.

3.3  Prototype  implementation 

We have designed a prototype XSB [Sagonas et al., 1994] program that takes
a system model and observations as input and enumerates the strong root
candidates. XSB is a Prolog dialect that uses tabling (memoization) to
improve termination. Given the system model in Figure 2 and facts stating
that any sending of fail or down indicates system failure, i.e. those
actions are critical actions, and the observations that
$(o_2, \mathit{fail})$ has not occurred and $(o_3, \mathit{fail})$ has
occurred, the XSB Prolog program computed $(o_1, \mathit{fail})$ to be the
single strong root candidate.

The system consists of three objects that all execute the same process.
See Figure 4 for an automaton representation of a similar process
(parameters are not explicit in the automaton). Consider the critical
actions. Obviously, no down message can be the root action since it will
always be preceded by a fail action, and neither can
$(o_2, \mathit{fail})$ be the root action since it is known not to have
occurred at all. This leaves us with $(o_1, \mathit{fail})$ and
$(o_3, \mathit{fail})$. It is consistent with the system model and the
observations to assume that $(o_3, \mathit{fail})$ is the root action,
since if $o_2$ receives the fail from $o_3$, then $o_1$ can send fail to
$o_3$ afterwards. We cannot prove that $(o_3, \mathit{fail})$ has
happened, however. This can be done for $(o_1, \mathit{fail})$, and
therefore it is the only action that is both enabled root and present.

Thus, having some intuition of the system makes the fault isolation
described above almost trivial, but the key motivation of this work is to
formalize and automate this intuition.

4 Future  Work 

In previous work with Larsson [Larsson et al., 2000; Larsson, 1999] we
studied the fault isolation problem using a structural model. A key
feature of that approach is the use of software engineering models, in
particular UML [Object Management Group, 1999] class diagrams. Such a
model can be developed and maintained at a relatively low cost, being an
integrated part of the software development process. The work presented
here and in our previous work [Lawesson, 2000; Lawesson et al., 2001] aims
to strengthen the diagnostic capability while still using standard and
state-of-the-art modeling notations. Behavior in UML is often expressed
using statecharts, and process algebras provide a textual representation
of state machines. Of course, forcing the software developer to construct
complete statecharts for all classes is not realistic in large software
systems; hence, reasoning must be able to cope with incomplete or missing
behavioral descriptions. Our approach should also be extended to deal with
the special features characteristic of object oriented software systems
such as classes and inheritance. Below we sketch some partial solutions to
such issues, which will be addressed in our future work.

4.1  Classes, behaviors and inheritance

Our  process  algebra  expresses  a system  model  as  a flat  set 
of  the  process  defining  equations  without  any  hierarchy.  In 
an  object  oriented  design,  the  system  behavior  is  partitioned 
into  classes.  Furthermore,  inheritance  allows  for  a hierarchy 
of  classes.  We  implement  simple  schemas  called  classes  in 
order  to  achieve  the  partitioning  and  (inheritance)  hierarchy. 

Thus, in the following a class is a scheme that can be compiled to a set
of process defining equations. A class $C$ may inherit parts of its
characteristics (e.g. its behavior) from a superclass, and in that context
$C$ is referred to as the subclass. A state inheritance sequence

$$S \mapsto [A_1, A_2, \ldots, A_n]$$

is a declaration saying that state $S$ in the superclass is refined by
states $A_1, A_2, \ldots, A_n$ in the subclass, where $A_1$ is the default
state (i.e. the substate entered when entering the superstate $S$). When
compiling the class to process equations, the inheritance sequence
describes how the defining equations from the superclass should be used.
Thus, we implement a simple form of inheritance as refinement. The syntax
used for defining classes below is

$$N = (S, I), D$$

where $N$ is the name of the class, $S$ is the name of the superclass (if
any), $I$ is the set of state inheritance sequences and $D$ is a set of
process defining equations. If there is no superclass we write
$N = (), D$.

Example 

Lacking formal tools, we outline the approach by an example. In the
following we define two classes $C_1$ and $C_2$, where $C_2$ refines the
state $A$ in $C_1$ with states $C$ and $D$. We say that states $C$ and $D$
refine state $A$.

C1 = (), {
    A ≝ b.B
    B ≝ a.A
}

C2 = (C1, {A ↦ [C, D]}), {
    C ≝ d.D
    D ≝ c.C
    B ≝ e.D
}


Now, $C_2$ may be compiled to the following process equations.

C2:C ≝ b.C2:B + d.C2:D
C2:B ≝ a.C2:C + e.C2:D
C2:D ≝ b.C2:B + c.C2:C

The outgoing transitions from $A$ become outgoing transitions from all
refining states, while the incoming transitions are moved from the refined
state to the first of the refining states. If there are transitions from
the same state in both super- and subclass, they are joined as
nondeterministic choice, as with state $B$ and transitions $a.A$ and
$e.D$. The states are prefixed with the class name to avoid name space
clashes.
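The compilation described above can be sketched in a few lines of Python. This is a hedged illustration for the single-level case of the example only: each defining equation is assumed to be a list of `(action, target state)` summands, parameters are ignored, and the function name `compile_class` is hypothetical.

```python
def compile_class(name, sup_eqs, inherit, sub_eqs):
    """Flatten a subclass into prefixed process equations.

    sup_eqs and sub_eqs map a state to its list of (action, target)
    summands; inherit maps a refined superclass state to the list of
    refining states, the first being the default state.
    """
    def redirect(target):
        # Incoming transitions move to the first refining state.
        return inherit[target][0] if target in inherit else target

    flat = {}
    for state, summands in sup_eqs.items():
        # Outgoing transitions of a refined state are copied to
        # every refining state; other states keep their own.
        for s in inherit.get(state, [state]):
            flat.setdefault(s, []).extend(
                (a, redirect(t)) for a, t in summands)
    for state, summands in sub_eqs.items():
        # Summands from super- and subclass are joined as a choice.
        flat.setdefault(state, []).extend(
            (a, redirect(t)) for a, t in summands)

    # Prefix every state with the class name.
    return {
        f"{name}:{s}": " + ".join(f"{a}.{name}:{t}" for a, t in summands)
        for s, summands in flat.items()
    }


c1 = {"A": [("b", "B")], "B": [("a", "A")]}
c2 = {"C": [("d", "D")], "D": [("c", "C")], "B": [("e", "D")]}
equations = compile_class("C2", c1, {"A": ["C", "D"]}, c2)
```

Applied to the example, `equations` contains the three compiled equations for `C2:C`, `C2:B` and `C2:D` given above. Deeper inheritance chains would require compiling the superclass first; that is outside the scope of this sketch.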

4.2  Statecharts 

Since both processes and statecharts have a transition system semantics,
the mapping is straightforward once the semantics of the statecharts is
fixed. We use a handshaking semantics of the statecharts, because of
expressivity and domain properties as described in [Lawesson, 2000]. We
define the semantics via our process language by providing a mapping from
statecharts to processes. The mapping is rather straightforward since we
restrict ourselves to statecharts without history states - essentially
making the statechart equivalent to an automaton without hierarchy, see
for example [Lilius and Porres, 1999]. The process algebra example in
Figure 2 could represent a slightly improved version of the automaton in
Figure 4 with structure information (i.e. the states S, Main and Setup)
added.

4.3  Default  behaviors  of  class  diagrams 

Since a class diagram in general does not contain behavioral information
in terms of statecharts, we may introduce a superclass called Propagator
that encapsulates the behavior of being able to propagate errors as well
as reporting errors to the log, and a subclass Breakable that is a
propagator that can introduce errors by the transition crit. The idea is
to let all classes inherit from Propagator, and then refine with
behavioral models when available, and use Breakable for classes that may
give rise to critical actions but where a behavioral model is missing. The
definitions of Propagator and Breakable are given in Figure 5.

The paths of error propagation between classes are computed by using
information about dependencies between classes in the class diagrams (as
in [Larsson, 1999; Larsson et al., 2000]), and then reflected in the
Failed(x) state that models error propagation.
Figure 3: A global picture of the example system consisting of the objects
$o_1$, $o_2$ and $o_3$. Each object has behavior as described in Figure 4.

Figure 4: An automaton describing a peer-to-peer system. Sending actions
are suffixed with ! and the rest of the actions are receiving actions.
There are no internal actions in this automaton.


Propagator_n = (), {
    Main                ≝ [right-hand side not recoverable from the extracted text]
    OK(x)               ≝ fail().Failing(x)
    Failing(x)          ≝ log.Failed(x) + nolog.Failed(x)
    Failed(x1, ..., xn) ≝ x1:fail().Failed(x) + ... + xn:fail().Failed(x)
}

Breakable = (Propagator, {}), {
    OK(x) ≝ crit.Failing(x)
}

Figure 5: Definitions of the classes Propagator and Breakable


